This repository contains the sc-PBMC (Perez et al. Science 2022) pseudobulk gene expression prediction models.

This directory contains a single cell type, "PBMC", which is a pseudobulk aggregation of the 9 cell types in Perez et al. Science 2022.

The cell type directory contains a summary file with one line per gene. The summary file is named ${cell_type}"_gene_summary.txt".

Each gene also has 4 data files. The four files are:

1. ${cell_type}"_"${gene_ensamble_id}"_cis_snp_info.txt". This file contains one line for each snp in the cis window of the corresponding gene. Each line contains the name of the snp.
2. ${cell_type}"_"${gene_ensamble_id}"_susie_alpha.txt". This file contains one line for each snp and one column for each of the susie components. Each value is a posterior inclusion probability. This corresponds to the "alpha" matrix described here: https://stephenslab.github.io/susieR/reference/susie_rss.html
3. ${cell_type}"_"${gene_ensamble_id}"_susie_mu.txt". This file contains one line for each snp and one column for each of the susie components. Each value is a posterior mean, conditional on inclusion. This corresponds to the "mu" matrix described here: https://stephenslab.github.io/susieR/reference/susie_rss.html
4. ${cell_type}"_"${gene_ensamble_id}"_susie_mu2.txt". This file contains one line for each snp and one column for each of the susie components. Each value is a posterior second moment, conditional on inclusion. This corresponds to the "mu2" matrix described here: https://stephenslab.github.io/susieR/reference/susie_rss.html
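
As a rough illustration of how the alpha, mu, and mu2 matrices can be combined, the Python sketch below computes per-snp posterior mean effects, posterior standard deviations, and posterior inclusion probabilities using the standard SuSiE summaries. It is not part of this repository. It assumes the files are whitespace-delimited numeric text with no header row, with snps in rows and components in columns; check the actual files before relying on read_matrix. All function names are hypothetical. Effects are on whatever scale was used when the models were fit, and susieR may additionally drop components with near-zero prior variance when it reports PIPs.

```python
import math


def read_matrix(path):
    """Read a numeric text file into a list of rows.

    Assumption: whitespace-delimited values, no header, one snp per line.
    """
    with open(path) as handle:
        return [[float(x) for x in line.split()] for line in handle if line.strip()]


def posterior_mean_effects(alpha, mu):
    """Per-snp posterior mean effect: sum over components of alpha * mu."""
    return [sum(a * m for a, m in zip(a_row, m_row))
            for a_row, m_row in zip(alpha, mu)]


def posterior_sd_effects(alpha, mu, mu2):
    """Per-snp posterior sd: sqrt(sum over components of alpha*mu2 - (alpha*mu)^2)."""
    sds = []
    for a_row, m_row, m2_row in zip(alpha, mu, mu2):
        var = sum(a * m2 - (a * m) ** 2
                  for a, m, m2 in zip(a_row, m_row, m2_row))
        sds.append(math.sqrt(max(var, 0.0)))  # guard against tiny negative rounding
    return sds


def snp_pips(alpha):
    """Per-snp PIP: 1 - product over components of (1 - alpha)."""
    pips = []
    for a_row in alpha:
        not_included = 1.0
        for a in a_row:
            not_included *= 1.0 - a
        pips.append(1.0 - not_included)
    return pips


# Toy matrices (2 snps x 2 components), made up for illustration only.
toy_alpha = [[0.9, 0.1], [0.1, 0.9]]
toy_mu = [[1.0, 2.0], [0.0, -1.0]]
toy_mu2 = [[1.5, 5.0], [0.5, 1.5]]

toy_effects = posterior_mean_effects(toy_alpha, toy_mu)
toy_sds = posterior_sd_effects(toy_alpha, toy_mu, toy_mu2)
toy_pips = snp_pips(toy_alpha)
```

With real data, the three matrices would come from read_matrix applied to the susie_alpha, susie_mu, and susie_mu2 files for one gene, and the rows would line up with the snps listed in the cis_snp_info file.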

${cell_type} is a string corresponding to the name of the cell type.
${gene_ensamble_id} is a string corresponding to the name (Ensembl ID) of the gene.
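
The naming scheme above can be sketched in Python as follows. This is only an illustration of how the file names are assembled from the two placeholders. The helper names, the model_dir argument, and the gene identifier in the example are hypothetical and do not refer to anything shipped in this repository.

```python
import os

# File name suffixes taken from the list of per-gene files above.
GENE_FILE_SUFFIXES = {
    "snp_info": "cis_snp_info.txt",
    "alpha": "susie_alpha.txt",
    "mu": "susie_mu.txt",
    "mu2": "susie_mu2.txt",
}


def summary_file_path(model_dir, cell_type):
    """Path to the per-cell-type summary file (one line per gene)."""
    return os.path.join(model_dir, f"{cell_type}_gene_summary.txt")


def gene_file_paths(model_dir, cell_type, gene_id):
    """Paths to the four per-gene files, keyed by a short label."""
    prefix = f"{cell_type}_{gene_id}_"
    return {label: os.path.join(model_dir, prefix + suffix)
            for label, suffix in GENE_FILE_SUFFIXES.items()}


# Example with a placeholder directory and a placeholder gene identifier.
example_paths = gene_file_paths("PBMC", "PBMC", "GENE_ID_PLACEHOLDER")
example_summary = summary_file_path("PBMC", "PBMC")
```

Here model_dir is assumed to be the cell type directory that holds both the summary file and the per-gene files.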

